Add text preprocessing and dialogue generation for TTS#2

Merged
razumau merged 3 commits into main from claude/review-app-improvements-Eso8v
Mar 2, 2026

Conversation

@razumau (Owner) commented Feb 20, 2026

Summary

This PR adds comprehensive text preprocessing capabilities and LLM-based dialogue generation to improve audio quality for text-to-speech conversion. It introduces multiple preprocessing modes, a new Dia TTS model integration, and refactors file handling to use temporary files for better concurrency.

Key Changes

New Preprocessing System

  • preprocess.py: New module with regex-based text cleaning for TTS consumption

    • Removes URLs, emails, citation markers, code blocks, and markdown formatting
    • Expands abbreviations (e.g., "e.g." → "for example") and numbers to words
    • Handles currency, percentages, and year detection
    • Removes HTML entities and normalizes whitespace
  • llm_preprocess.py: Claude-based article rewriting for natural audio narration

    • Converts written text to spoken language with proper flow
    • Removes visual references and technical elements
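
The regex-based cleaning steps listed for preprocess.py can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: the function name `clean_for_tts`, the abbreviation table, and the exact patterns are assumptions, and number-to-word expansion (which the PR delegates to num2words) is left out so the sketch needs only the standard library. HTML entities are decoded here with `html.unescape` rather than dropped, which is also an assumption.

```python
import html
import re

# Hypothetical abbreviation table; the module's actual list is not shown in this PR.
ABBREVIATIONS = {
    "e.g.": "for example",
    "i.e.": "that is",
    "etc.": "et cetera",
}

URL_RE = re.compile(r"https?://\S+")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
CITATION_RE = re.compile(r"\[\d+\]")          # numeric markers such as [12]
MARKDOWN_RE = re.compile(r"[*#`]+")           # emphasis, heading, inline-code marks
PERCENT_RE = re.compile(r"(\d+(?:\.\d+)?)%")  # 50% or 12.5%


def clean_for_tts(text: str) -> str:
    """Strip elements that read badly aloud and expand a few written forms."""
    text = URL_RE.sub(" ", text)
    text = EMAIL_RE.sub(" ", text)
    text = CITATION_RE.sub("", text)
    # Decode entities before stripping '#', so forms like &#39; survive intact.
    text = html.unescape(text)
    text = MARKDOWN_RE.sub("", text)
    for written, spoken in ABBREVIATIONS.items():
        text = text.replace(written, spoken)
    text = PERCENT_RE.sub(r"\1 percent", text)
    # Collapse the whitespace left behind by the removals above.
    return re.sub(r"\s+", " ", text).strip()
```

Running the removals before the expansions keeps the later patterns from matching inside URLs or email addresses that are about to be discarded anyway.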

Bot Integration

  • bot.py: Added preprocessing mode selection via /setpreprocess command
    • Four modes: "none", "regex", "llm", "dialogue"
    • Applies selected preprocessing before TTS generation
    • Displays preprocessing mode in episode descriptions
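
One way the mode selection could be wired up is a small validation step plus a lookup table of handlers. The sketch below is illustrative only: `VALID_MODES` mirrors the none/regex/llm/dialogue list from the commit message, while `normalize_mode`, `apply_preprocessing`, and the `handlers` mapping are hypothetical names that do not come from bot.py.

```python
from typing import Callable, Dict

Preprocessor = Callable[[str], str]

# Mode names as listed in the commit message for /setpreprocess.
VALID_MODES = ("none", "regex", "llm", "dialogue")


def normalize_mode(raw: str) -> str:
    """Validate the argument a user passes to /setpreprocess (hypothetical helper)."""
    mode = raw.strip().lower()
    if mode not in VALID_MODES:
        raise ValueError(
            f"Unknown preprocessing mode {raw!r}; expected one of {', '.join(VALID_MODES)}"
        )
    return mode


def apply_preprocessing(text: str, mode: str, handlers: Dict[str, Preprocessor]) -> str:
    """Run the handler registered for `mode`; "none" passes the text through unchanged."""
    mode = normalize_mode(mode)
    if mode == "none":
        return text
    # `handlers` is assumed to map each remaining mode to a callable,
    # e.g. the regex cleaner or a Claude-backed rewriter.
    return handlers[mode](text)
```

Passing the handlers in as a mapping keeps the Claude-backed modes replaceable with stubs in tests, so the dispatch logic can be exercised without API access.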

File Handling Improvements

  • extract_article.py: Refactored to use temporary files instead of fixed filenames

    • Prevents concurrency issues when multiple requests are processed
    • Properly cleans up temp files in finally block
  • models/kokoro.py: Updated to use temporary files for concat operations

    • Improved cleanup logic with existence checks
  • models/eleven.py: Added model selection support with AVAILABLE_MODELS dict
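
The temporary-file pattern described for extract_article.py and models/kokoro.py might look something like the sketch below. The helper name `with_temp_article` and the `consume` callback are hypothetical; only the ideas of a uniquely named temp file, cleanup in a `finally` block, and an existence check before deletion come from the PR description.

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def with_temp_article(text: str, consume: Callable[[str], T]) -> T:
    """Write `text` to a uniquely named temp file, pass its path to `consume`,
    and remove the file afterwards even if `consume` raises."""
    # mkstemp picks a unique name, so concurrent requests do not share a file.
    fd, path = tempfile.mkstemp(prefix="article_", suffix=".txt")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        return consume(path)
    finally:
        # The existence check avoids an error if the consumer already removed it.
        if os.path.exists(path):
            os.remove(path)
```

Compared with a fixed filename, each request gets its own file, so two articles processed at the same time cannot overwrite each other's input.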

Dependencies

  • Added num2words>=0.5.14 for number-to-word conversion
  • Updated kokoro>=0.9.4 for improved compatibility
  • Added anthropic (implicit via llm modules) for Claude API access
  • Removed spacy<=3.7.3 dependency

Testing

  • tests/test_preprocess.py: Comprehensive test suite for preprocessing pipeline
    • Tests individual cleaning functions and full pipeline
    • Validates abbreviation expansion, number conversion, and markdown removal

Notable Implementation Details

  • Temporary files use unique suffixes to avoid collisions in concurrent scenarios
  • Claude Haiku model used for cost-effective LLM preprocessing

https://claude.ai/code/session_01UwZwqWxsZcVMcecNCfWCxj

claude and others added 3 commits February 20, 2026 18:21
…tness fixes

Major improvements to the TTS podcast pipeline:

- Add regex-based text preprocessing (preprocess.py): removes URLs, code
  blocks, citations, markdown artifacts; expands numbers, abbreviations,
  and currency to spoken words using num2words
- Add LLM-based preprocessing modes: "llm" rewrites articles for natural
  audio narration, "dialogue" generates two-speaker podcast scripts for
  use with the Dia model (both via Claude Haiku)
- Add Dia TTS backend (models/dia_tts.py) for multi-speaker podcast
  generation using [S1]/[S2] speaker tags
- Upgrade Kokoro: bump from v0.7.16 to v0.9.4+, expand voice list from
  4 to 12 (American + British English), add explicit repo_id
- Upgrade ElevenLabs: output format from 32kbps to 128kbps, add eleven_v3
  model option, include model ID in metadata
- Add /setpreprocess Telegram command (none/regex/llm/dialogue modes)
- Fix crash when extract_webpage_content returns None (bot.py)
- Fix temp file concurrency: use tempfile for concat.txt, article files
- Fix WAV file cleanup: always clean up in finally block
- Remove unused html_fetcher.py and spacy dependency

https://claude.ai/code/session_01UwZwqWxsZcVMcecNCfWCxj

@razumau force-pushed the claude/review-app-improvements-Eso8v branch from e98b950 to 7cf8db1 on March 1, 2026 22:31
@razumau merged commit dbd1400 into main on Mar 2, 2026
1 check passed
@razumau deleted the claude/review-app-improvements-Eso8v branch on March 2, 2026 20:18